Locating Data Sources in Large Distributed Systems
Abstract
Querying large numbers of data sources is gaining importance due to increasing numbers of independent data providers. One of the key challenges is executing queries on all relevant information sources in a scalable fashion and retrieving fresh results. The key to scalability is to send queries only to the relevant servers and avoid wasting resources on data sources which will not provide any results. Thus, a catalog service, which would determine the relevant data sources given a query, is an essential component in efficiently processing queries in a distributed environment. This paper proposes a catalog framework which is distributed across the data sources themselves and does not require any central infrastructure. As new data sources become available, they automatically become part of the catalog service infrastructure, which allows scalability to large numbers of nodes. Furthermore, we propose techniques for workload adaptability. Using simulation and real-world data we show that our approach is valid and can scale to thousands of data sources.
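The idea of sending a query only to the relevant servers can be illustrated with a minimal sketch. It is not the paper's framework: the abstract describes a catalog distributed across the data sources, whereas this example uses a single in-memory list for brevity. The names `SourceSummary` and `relevant_sources`, and the choice of summarizing each source by a numeric value range, are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceSummary:
    """Hypothetical catalog entry: the value range one source holds for an attribute."""
    source_id: str
    low: float
    high: float


def relevant_sources(catalog, q_low, q_high):
    """Return ids of sources whose range overlaps the query range [q_low, q_high].

    Sources with no overlap cannot contribute results, so they are never contacted.
    """
    return [s.source_id for s in catalog if s.low <= q_high and q_low <= s.high]


# Illustrative catalog contents (made-up values, not data from the paper).
catalog = [
    SourceSummary("node-a", 0, 100),
    SourceSummary("node-b", 50, 150),
    SourceSummary("node-c", 200, 300),
]

# A range query over [120, 180] only needs to be sent to the overlapping sources.
targets = relevant_sources(catalog, 120, 180)
print(targets)
```

In a decentralized setting such summaries would themselves be partitioned across the participating nodes rather than held in one list; how the paper does this is not stated in the abstract.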
Related Resources
Optimal Placement of DGs in Distribution System including Different Load Models for Loss Reduction using Genetic Algorithm
Distributed generation (DG) sources are becoming more prominent in distribution systems due to the incremental demands for electrical energy. Locations and capacities of DG sources have great impacts on the system losses in a distribution network. This paper presents a study aimed for optimally determining the size and location of distributed generation units in distribution systems with differ...
Data Model and Query Evaluation in Global Information
Global information systems involve a large number of information sources distributed over computer networks. The variety of information sources and disparity of interfaces make the task of easily locating and efficiently accessing information over the network very cumbersome. We describe an architecture for global information systems that is especially tailored to address the challenges raised i...
A Multiagent-based Framework for Integrating Biological Data
Biological data has been rapidly increasing in volume in different Web data sources. To query multiple data sources manually on the internet is time consuming for biologists. Therefore, systems and tools that facilitate searching multiple biological data sources are needed. Traditional approaches to build distributed or federated systems do not scale well to the large, diverse, and the growing ...
E2DR: Energy Efficient Data Replication in Data Grid
Data grids are an important branch of grid computing which provide mechanisms for the management of large volumes of distributed data. Energy efficiency has recently emerged as a hot topic in large distributed systems. The development of computing systems is traditionally focused on performance improvements driven by the demand of client's applications in scientific and business domai...